Abstract

The estimation of differential energy loss for charged particles in tracker detectors is studied. The robust truncated 
mean method can be generalized to the linear combination of the energy deposit measurements. The optimized 
weights in case of arithmetic and geometric means are obtained using a detailed simulation. The results show better 
particle separation power for both semiconductor and gaseous detectors. 

1. Introduction

The identification of charged particles is crucial in several fields of particle and nuclear physics: particle spectra,
correlations, selection of daughters of resonance decays, and reduction of the background of rare physics processes
[1, 2]. Tracker detectors, both semiconductor and gaseous, can be employed for particle identification, or yield
extraction in the statistical sense, by proper use of energy deposit measurements along the trajectory of the particle.
While for gaseous detectors a wide momentum range is available, in semiconductors there is practically no logarithmic
rise of differential energy loss ($dE/dx$) at high momentum, thus only momenta below the minimum ionization
region are accessible. In this work two representative materials, silicon and neon, are studied. Energy loss of charged
particles inside matter is a complicated process. For a detailed theoretical model and several comparisons to measured
data seeRefs. flU. 

While the energy lost and deposited differ, they will be used interchangeably in the discussion. It is also clear that 
the energy read out also varies due to noise and digitization effects. 

This article is organized as follows: Sec. 2 describes the microscopic energy loss simulation used in this study.
Sec. 3 introduces the basic method of truncated mean, while Sec. 4 deals at length with the optimization of weighted
arithmetic and geometric means. Possible handling of different path lengths is discussed in Sec. 5. Results of the
simulation and applications of the optimized weighted means are shown in Sec. 6. The work ends with conclusions
and is supplemented by four Appendices with further results: optimal weights in case of few (App. A)
and many measurements (App. B), some theoretical insights (App. C), and the connection to maximum likelihood
estimation (App. D).



2. Simulation 

When a charged particle traverses material it loses energy in several discrete steps, dominantly by resonance
excitations ($\delta$-function) and Coulomb excitations (truncated power-law term). The latter is the reason for the long tail
observed in energy deposit distributions.

Figure 1: Comparison of differential energy loss distributions for 300 μm silicon (left) and 1 cm neon (right), at $\beta\gamma = 3.17$. The probability density
function (solid) is shown with theory motivated fits (dashed). Above a certain $dE/dx$ value, indicated by the vertical dash-dotted lines, a power
function was used, while below that the product of a power and a Gaussian (silicon) or the product of a power and an exponential (neon) was taken.
For details see App. C.



The probability of an excitation, that is, of an energy deposit, along the path of the incoming particle is a function of $\beta\gamma = p/m$
of the particle and depends on properties of the traversed material. The conditional probability density $p(\Delta|t)$ of a deposit
$\Delta$ along a given path length $t$ can be built using the above mentioned elementary excitations combined with an
exponential occurrence model. The details of the microscopical simulation can be found in Refs. Jstl, 101 and js]]- 
The result of these recursive convolutions is a smooth asymmetric density distribution with long tails (see Fig. 1,
solid lines). In order to model detector and readout noise, Gaussian random values with standard deviation of 2 keV
(0.01 keV) were added to each hit for silicon (neon). For further studies on noise dependence see Sec. 6.1.



3. Truncated mean 

There are several possibilities for the estimation of the differential energy loss of a charged particle. An approach
using some theoretical model of energy loss would enable the use of advanced methods such as maximum likelihood
estimation. However, especially at startup, particle detectors are not expected to be understood to the degree that
would enable the use of such an estimator. The results would be quite sensitive to the choice of the model, the precision of
the detector gain calibration, the level of noise and several backgrounds.

One of the robust and simple estimators is the so-called truncated mean that is traditionally used in gas filled
detector chambers [6, 7]. It reduces the influence of high energy deposits in the tail of the energy deposit distribution.
To this end, a given fraction of the measurements, the upper 30-60% (and sometimes the lower 0-10%), is discarded and
only the remaining measurements are averaged with equal weights. Recent studies show that five or even four layers
of silicon are enough to reach 10% resolution using the truncated mean method [2].

Let $\Delta_i$ denote the deposit and $t_i$ the path length in the active material of the detector for the $i$th
measurement along the particle trajectory. The differential energy loss is $y_i = \Delta_i / t_i$. The measurements
are numbered such that they are ordered, $y_i \le y_{i+1}$, in case of $n$ space points ($i = 1, 2, \dots, n$). The estimator is simply

$$ y = \frac{\sum_{i=1}^n w_i y_i}{\sum_{i=1}^n w_i}. $$

Figure 2: Correlation matrix of hits $(i, j)$ for 300 μm silicon (left) and 1 cm neon (right), at $\beta\gamma = 3.17$, in case of 20 hits on track.

E.g. if only the lower half of the hits is used, (0%,50%) truncation, the corresponding weights $w_i$ are

$$ w_i = \begin{cases} 0 & \text{if } 2i > n + 1 \\ 1/2 & \text{if } 2i = n + 1 \\ 1 & \text{if } 2i < n + 1. \end{cases} $$

In the rest of this paper by simple truncated mean we will mean this (0%,50%) truncation.

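
The (0%,50%) truncation can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and dividing by the sum of the weights is an assumption made here so that the result is a proper average of the retained hits.

```python
def truncation_weights(n):
    """Weights of the (0%,50%) truncated mean for n ordered hits (i = 1..n)."""
    weights = []
    for i in range(1, n + 1):
        if 2 * i < n + 1:
            weights.append(1.0)   # lower half: fully used
        elif 2 * i == n + 1:
            weights.append(0.5)   # middle hit (odd n): half weight
        else:
            weights.append(0.0)   # upper half: discarded
    return weights


def truncated_mean(y_values):
    """Simple truncated mean of the y_i = Delta_i / t_i values of one track."""
    ordered = sorted(y_values)                 # weights refer to ordered hits
    w = truncation_weights(len(ordered))
    # Assumption: normalize by the sum of weights to obtain an average.
    return sum(wi * yi for wi, yi in zip(w, ordered)) / sum(w)


if __name__ == "__main__":
    # The large deposit (100) sits in the discarded upper half.
    print(truncated_mean([5.0, 1.0, 3.0, 2.0, 100.0]))
```

For an odd number of hits the middle measurement enters with half weight, so exactly half of the hits contribute in the statistical sense.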
4. Weighted means 

It is possible to generalize this estimator, and optimize the weights, by looking at some measures of its distribution.
The generalization can be twofold. Instead of a simple truncated mean, the more general weighted mean (a linear
combination) can be examined, where the constant weights are allowed to take on different values, not just 0, 1/2 or 1.
In addition, the weighted mean may perform better when averaging a monotonic
function of the measurements, $x_i = R(y_i)$, than when taking the $y_i$ values themselves. It is clear that the transformed
values are also ordered: $x_i \le x_{i+1}$. Let us look at the linear combination of $n$ measurements

$$ y = R^{-1}\left( \sum_{i=1}^n w_i R(y_i) \right), \qquad R(y) = x = \sum_{i=1}^n w_i x_i, \tag{1} $$

where $\sum_i w_i = 1$. In the following two cases will be examined further. If $R$ is the identity we get back the weighted
arithmetic mean, while $R(y) = \log y$ gives the weighted geometric mean. While the former choice is historical and
the simplest, the geometric mean has its root in the behavior of the energy loss distribution, since that can be approximated
by a log-normal distribution. In that sense $\log(y)$ seems to have a more symmetric distribution than the long tailed $y$.

In fact an optimization of the transformation function $R$ could complete the study, but that task appears to be highly
non-trivial. Both arithmetic and geometric means are special cases among power functions, $R(y) = y^p$, with $p = 1$ and
$p \to 0$, respectively:

$$ y_\mathrm{arith} = \sum_{i=1}^n w_i y_i, \qquad y_\mathrm{geom} = \exp\left( \sum_{i=1}^n w_i \ln y_i \right). $$

Note that the extreme cases $p \to -\infty$ (minimum) and $p \to \infty$ (maximum) are also included in the weighted arithmetic
mean.
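
The two estimators can be written down directly. The following is a minimal sketch, assuming the weights already sum to one and refer to the hits in increasing order; the function name and the `kind` argument are illustrative choices, not part of the original work.

```python
import math


def weighted_mean(y_values, weights, kind="arith"):
    """Weighted mean of ordered measurements; kind is 'arith' or 'geom'."""
    if len(y_values) != len(weights):
        raise ValueError("need one weight per measurement")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    ordered = sorted(y_values)            # weights refer to ordered hits
    if kind == "arith":                   # R(y) = y
        return sum(w * y for w, y in zip(weights, ordered))
    if kind == "geom":                    # R(y) = log y, requires y > 0
        return math.exp(sum(w * math.log(y) for w, y in zip(weights, ordered)))
    raise ValueError("kind must be 'arith' or 'geom'")


if __name__ == "__main__":
    hits = [4.0, 1.0]
    print(weighted_mean(hits, [0.5, 0.5], "arith"))  # plain average
    print(weighted_mean(hits, [0.5, 0.5], "geom"))   # geometric mean
    print(weighted_mean(hits, [1.0, 0.0], "arith"))  # minimum of the hits
```

Putting all the weight on the first or on the last ordered hit reproduces the minimum or the maximum, the two extreme cases mentioned in the text. Note that the geometric variant fails for non-positive values, which can occur once noise is added to small deposits.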




4.1. Goal of the optimization 

In order to facilitate clean particle identification and extraction of particle yields, the distribution of the estimator
should be narrow and should be sensitive to changes in the average energy loss. If the momentum of the particle or
its mass is altered, the distribution of deposited energy will change accordingly. A small change can be modelled by
multiplying each energy deposit along the trajectory with a factor $1 + \alpha$, where $\alpha$ is small. If the distribution of the
estimator is close to Gaussian (mean $m$, standard deviation $\sigma$), its mean will shift. The separation power between the
original and altered distributions, in units of standard deviation, will be

$$ \frac{1}{\sigma} \frac{dm}{d\alpha}. $$

In case of the arithmetic mean, the estimator is linear, $dm/d\alpha = m$, thus the relative resolution $\sigma/m$ has to be minimized
(see Sec. 4.2). For the geometric mean the multiplication corresponds to a shift by $\alpha$, hence the absolute resolution $\sigma$
is to be minimized (see Sec. 4.3).

The mean of the $i$th ordered measurement and the covariance of the $i$th and $j$th measurements play a central role
in the optimization. They are determined as

$$ m_i = \langle x_i \rangle, \qquad V_{ij} = \langle x_i x_j \rangle - \langle x_i \rangle \langle x_j \rangle. $$

Both $m_i$ and $V_{ij}$ can be estimated using the detailed physics simulation described in Sec. 2. The correlation matrix for
silicon and neon is shown in Fig. 2 for a given thickness and $\beta\gamma$ choice, in case of 20 hits on track. Higher deposits
are strongly correlated, thus they contain less information and should get less weight than other hits. The mean and
variance of the estimator $\sum_{i=1}^n w_i x_i$ (Eq. (1)) are

$$ m = \sum_{i=1}^n m_i w_i, \qquad \sigma^2 = \sum_{i,j=1}^n w_i V_{ij} w_j. \tag{2} $$

With the help of optimized weights not only the differential energy loss, but also its variance can be estimated, a
definite advantage over the simple truncated or other plain averages. The weights are of practical use if, for a wide
range of number of hits on track, they appear independent of or insensitive to $\beta\gamma$ values and material thickness. In the
following it is shown that this is indeed the case.
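
The moments of the ordered measurements and Eq. (2) can be illustrated with a short sketch. The log-normal sampler below is only a toy stand-in for the detailed simulation of Sec. 2 (an assumption for illustration), and all function names are hypothetical.

```python
import random


def order_statistics_moments(tracks):
    """Means m_i and covariances V_ij of the ordered measurements."""
    ordered = [sorted(track) for track in tracks]
    n_tracks, n = len(ordered), len(ordered[0])
    m = [sum(t[i] for t in ordered) / n_tracks for i in range(n)]
    V = [[sum(t[i] * t[j] for t in ordered) / n_tracks - m[i] * m[j]
          for j in range(n)] for i in range(n)]
    return m, V


def estimator_mean_variance(w, m, V):
    """Mean and variance of sum_i w_i x_i, as in Eq. (2)."""
    n = len(w)
    mean = sum(m[i] * w[i] for i in range(n))
    variance = sum(w[i] * V[i][j] * w[j] for i in range(n) for j in range(n))
    return mean, variance


# Toy deposits: log-normal values instead of simulated energy loss (assumption).
rng = random.Random(1)
tracks = [[rng.lognormvariate(0.0, 0.3) for _ in range(4)] for _ in range(20000)]
m, V = order_statistics_moments(tracks)
mean, variance = estimator_mean_variance([0.25] * 4, m, V)
print(m, mean, variance)
```

With equal weights the estimator reduces to the plain average of the four hits, so its mean and variance agree with those of the average of four independent deposits. Any other weight vector can be evaluated the same way without rerunning the sampler.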

4.2. Weighted arithmetic mean 

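The task is to minimize the relative resolution

$$ \frac{\sigma}{m} = \frac{\sqrt{\mathbf{w}^T V \mathbf{w}}}{\mathbf{m}^T \mathbf{w}} $$

by varying the weights $\mathbf{w}$. (Note that here we switched to vector and matrix forms.) The square of this quantity is

$$ q(\mathbf{w}) = \frac{\mathbf{w}^T V \mathbf{w}}{(\mathbf{w}^T \mathbf{m})(\mathbf{m}^T \mathbf{w})} = \frac{\mathbf{w}^T V \mathbf{w}}{\mathbf{w}^T M \mathbf{w}}. $$

The term on the right side is a generalized Rayleigh quotient, where $M = \mathbf{m} \otimes \mathbf{m}$ is a dyadic matrix. The first variation
of $q$ is

$$ \delta q = \frac{\delta[\mathbf{w}^T (V - qM) \mathbf{w}]}{\mathbf{w}^T M \mathbf{w}}, $$

where $q$ is held fixed inside the bracket. A vector $\mathbf{w}$ that minimizes $q$ must give $\delta q = 0$ and thus satisfy
$V\mathbf{w} = qM\mathbf{w}$, which can be rearranged to get

$$ \mathbf{w} = q\, (V^{-1} \mathbf{m}) (\mathbf{m}^T \mathbf{w}). $$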

Figure 3: From top to bottom: normalization factors for 300 μm silicon and 1 cm neon, at various $\beta\gamma$ values; normalization factors in case of
$\beta\gamma = 3.17$ for silicon and neon, at various thicknesses. Lines are drawn to guide the eye.



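Here the vectors $\mathbf{w}$ and $V^{-1}\mathbf{m}$ should be parallel. If the sum of the weights has to be one, the optimal weights are

$$ \mathbf{w} = \frac{V^{-1} \mathbf{m}}{\mathbf{1}^T V^{-1} \mathbf{m}}, \tag{3} $$

where $\mathbf{1}$ is a column vector of ones. It follows that the value of the relative resolution at the minimum is

$$ \min\left( \frac{\sigma}{m} \right) = \frac{1}{\sqrt{\mathbf{m}^T V^{-1} \mathbf{m}}}. $$

The sensitivity on the weights could be obtained from the Hessian

$$ H = 2\, \frac{V - qM}{(\mathbf{w}^T \mathbf{m})(\mathbf{m}^T \mathbf{w})}. $$

Since at the minimum $H\mathbf{w} = 0$, according to Cramer's rule $\det H = 0$, so $H$ is singular. The sensitivity on weights,
corresponding to a 1% increase of the relative resolution, can be demonstrated as

$$ \Delta w_i = \sqrt{10^{-2} \cdot 4q / H_{ii}}. \tag{4} $$
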

4.3. Weighted geometric mean 

The task is to minimize the variance $\sigma^2$, with the constraint $\sum_i w_i = 1$. Thus, the expression to minimize is

$$ q(\mathbf{w}) = \mathbf{w}^T V \mathbf{w} + \lambda (\mathbf{1}^T \mathbf{w} - 1), $$

Figure 4: Comparison of differential energy loss distributions for 300 μm silicon (left) and 1 cm neon (right), at $\beta\gamma = 3.17$. The probability density
function (solid) is shown with estimators obtained using the simple truncated mean (dashed) and the weighted arithmetic mean after optimization
(dash-dotted), in case of 6 hits on track. The horizontal bar indicates the magnitude of noise (2 keV for silicon, 0.01 keV for neon), multiplied by a
factor 10.



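where $\lambda$ is a Lagrange multiplier. At the minimum there is a linear system of equations to solve,

$$ 2V\mathbf{w} + \lambda \mathbf{1} = 0, \qquad \mathbf{1}^T \mathbf{w} - 1 = 0, $$

or in matrix form

$$ H \begin{pmatrix} \mathbf{w} \\ \lambda \end{pmatrix} = \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix}, \qquad H = \begin{pmatrix} 2V & \mathbf{1} \\ \mathbf{1}^T & 0 \end{pmatrix}, $$

where $H$ is also the Hessian of $q$. The block matrix $H$ can be inverted,

$$ H^{-1} = \frac{1}{\mathbf{1}^T V^{-1} \mathbf{1}} \begin{pmatrix} \frac{1}{2} \left[ (\mathbf{1}^T V^{-1} \mathbf{1}) V^{-1} - V^{-1} \mathbf{1} \mathbf{1}^T V^{-1} \right] & V^{-1} \mathbf{1} \\ \mathbf{1}^T V^{-1} & -2 \end{pmatrix}, $$

and the equations are solved. The optimal weights are

$$ \mathbf{w} = \frac{V^{-1} \mathbf{1}}{\mathbf{1}^T V^{-1} \mathbf{1}}, \tag{5} $$

while the multiplier is $\lambda = -2/(\mathbf{1}^T V^{-1} \mathbf{1})$. A comparison of Eq. (3) and Eq. (5) shows that both expressions have a
similar structure. It follows that the value of the standard deviation at the minimum is

$$ \min(\sigma) = \frac{1}{\sqrt{\mathbf{1}^T V^{-1} \mathbf{1}}}, $$

which is at the same time the relative resolution of the back-transformed $y = \exp(x)$. The sensitivity on weights,
corresponding to a 1% increase of the relative resolution, can be obtained similarly.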
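
Since the optimal weights of the arithmetic and geometric means share the structure $V^{-1}\mathbf{r} / (\mathbf{1}^T V^{-1} \mathbf{r})$, with $\mathbf{r} = \mathbf{m}$ or $\mathbf{r} = \mathbf{1}$, both can be computed by one routine. The sketch below uses a small Gaussian elimination instead of a linear algebra library; the function names and the 2x2 example numbers are made up for illustration only.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def optimal_weights(V, r):
    """Return w = V^-1 r / (1^T V^-1 r) and 1 / sqrt(r^T V^-1 r)."""
    u = solve_linear(V, r)                          # u = V^-1 r
    norm = sum(u)                                   # 1^T V^-1 r
    weights = [ui / norm for ui in u]
    resolution = 1.0 / sum(ri * ui for ri, ui in zip(r, u)) ** 0.5
    return weights, resolution


# Made-up example: two ordered hits, the upper one with a larger variance.
V = [[1.0, 0.5], [0.5, 4.0]]
m = [1.0, 2.0]
w_arith, rel_res = optimal_weights(V, m)            # arithmetic mean: r = m
w_geom, abs_res = optimal_weights(V, [1.0, 1.0])    # geometric mean: r = 1
print(w_arith, rel_res, w_geom, abs_res)
```

For `r = m` the second return value is the minimal relative resolution; for `r = 1` it is the minimal standard deviation of the transformed variable. In practice `m` and `V` would come from the simulated ordered measurements.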

4.4. Rescaling the weights

Note that the resulting weights for both the arithmetic and geometric means are functions of the number of
measurements $n$. Similarly, both the mean and the variance of the estimators vary with $n$. In order to eliminate the
$n$ dependence of the mean, the weights are renormalized by taking the $n \to \infty$ limit as reference,

$$ \mathbf{w}'(n) = \mathbf{w}(n) \cdot \frac{m(\infty)}{m(n)}. $$

Normalization factors for the arithmetic mean are shown in Fig. 3. It is clear that the values quickly converge to 1
as the number of hits on track increases. Even at low hit numbers ($n < 4$) the factors are quite independent of $\beta\gamma$ and
thickness; this is especially true for silicon.

Figure 5: Optimal weights for 300 μm silicon (upper left) and 1 cm neon (upper right). Values are shown for $\beta\gamma = 1.00$, 3.17 and 10.0. Optimal
weights at $\beta\gamma = 3.17$ are shown for 300, 600 and 1200 μm silicon (lower left); 1, 2 and 4 cm neon (lower right). All results are given for tracks
with 15 hits ($i = 1, \dots, 15$). For comparison the optimal weights of the geometric mean are also shown (triangles down and crosses). The lines are
drawn to guide the eye.



5. Weighted mean with different path lengths 

In this study we assumed that the path lengths for each hit in the sensitive detector are the same. In case of real
particle trajectories the path lengths vary due to bending in the magnetic field and the placement of the detector units. The
energy deposits can be corrected towards a reference path length. The distribution of energy deposit $\Delta$ depends on the
velocity $\beta$ of the particle and the thickness of the traversed material $t$. To a good approximation, the most probable
energy loss $\Delta_p$ and the full width of the distribution at half maximum $\Gamma_\Delta$ are

$$ \Delta_p = \xi \left[ \ln \frac{2 m_e c^2 \beta^2 \gamma^2 \xi}{I^2} + 0.2000 - \beta^2 - \delta \right], \qquad \Gamma_\Delta = 4.018\, \xi, \tag{6} $$

where

$$ \xi = \frac{K}{2} z^2 \frac{Z}{A} \rho \frac{t}{\beta^2} $$

is the Landau parameter; $K = 4\pi N_A r_e^2 m_e c^2 = 0.307075$ MeV cm$^2$/mol; $z$ is the charge of the particle in electron
charge units; $Z$, $A$ and $\rho$ are the atomic number, mass number and the density of the material, respectively [8]; $I$ is the
mean excitation energy and $\delta$ is the density correction. Let
us consider the distribution of $y = \Delta/t$ values for a given particle, that is, at a fixed $\beta$. The width of its distribution is
independent of $t$ (Eq. (6)), while the most probable value $y_p$ scales with $t$ as

$$ y_p(t) = y_p(t_0) + \frac{K}{2} z^2 \frac{Z}{A} \frac{\rho}{\beta^2} \ln(t/t_0), $$

where $t_0$ denotes a fixed reference thickness. The slight $t$ dependence can be minimized by correcting each
measurement to $y(t_0)$. For that, $\beta$ can be estimated from $y$, obtained from the deposits without the above discussed path length
correction.
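
The path length correction amounts to subtracting the logarithmic term from each measured $y$. The following is a minimal sketch under stated assumptions: the function names are hypothetical, $y$ is taken in MeV/cm, the density in g/cm$^3$, and the silicon constants in the example ($Z = 14$, $A = 28.0855$, $\rho = 2.329$ g/cm$^3$) are standard material values supplied here for illustration.

```python
import math

K = 0.307075  # MeV cm^2 / mol


def xi_per_length(beta, z, Z, A, rho):
    """Landau parameter divided by thickness, xi / t, in MeV/cm."""
    return 0.5 * K * z * z * (Z / A) * rho / (beta * beta)


def correct_to_reference(y, t, t0, beta, z, Z, A, rho):
    """Shift y = Delta / t measured at path length t to the reference t0."""
    return y - xi_per_length(beta, z, Z, A, rho) * math.log(t / t0)


if __name__ == "__main__":
    # Singly charged particle in silicon, beta*gamma = 3.17 (assumed inputs).
    bg = 3.17
    beta = bg / math.sqrt(1.0 + bg * bg)
    y_measured = 2.9           # MeV/cm, made-up value
    print(correct_to_reference(y_measured, 0.036, 0.030, beta, 1, 14, 28.0855, 2.329))
```

A hit with a longer path than the reference is shifted down, a shorter one is shifted up, and a hit at the reference thickness is left unchanged.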



6. Results 

Particle identification and yield extraction in the statistical sense are particularly difficult at those momenta where
the differential energy losses of different types of particles are close. For hadrons, the pion-kaon separation becomes
problematic above about 0.8 GeV/c, while for the pion-proton case this happens above about 1.6 GeV/c. Hence the
relevant $\beta\gamma$ region is 1 - 10.

In this work charged particles with $\beta\gamma = 1.00$, 3.16 and 10.0 are studied, with number of hits 2 - 50. Both
semiconductor and gaseous detectors are investigated: for silicon, thicknesses of 300, 600 and 1200 μm, while for
neon, 1, 2 and 4 cm are chosen. For each study one million particles are used. (In order to speed up computation,
several million hits in 300 μm silicon and 1 cm neon were generated beforehand for each $\beta\gamma$ setting, and later
combined for longer path-length deposits.)

Comparisons of differential energy loss distributions and the estimators obtained via the simple truncated mean
and the weighted arithmetic mean after optimization are shown in Fig. 4, in case of 6 hits on track.

Optimal weights for 300 μm silicon and 1 cm neon are shown in Fig. 5-upper, for several $\beta\gamma$ values. Optimal
weights for silicon and neon at $\beta\gamma = 3.17$ are plotted in Fig. 5-lower, for several thicknesses. All results are given for
tracks with 15 hits, $i = 1, \dots, 15$. The weights are remarkably independent of $\beta\gamma$ and material thickness for silicon,
while some changes with increasing thickness are seen for neon. In case of silicon the hits $10 \le i \le 15$ have very
small, in some cases even negative weights, while the lowest deposits have the highest values. In case of 1 cm neon
the hits $1 \le i \le 8$ have roughly equal relevance, while the rest of the hits are not important. For comparison the
optimal weights of the geometric mean are also shown. While for silicon there is good agreement with arithmetic
mean weights, for neon the numbers somewhat differ but they show similar qualitative features.

The performance of the optimized estimator can be expressed as the ratio of the relative resolutions (weighted over
simple truncated mean). These are shown as a function of number of hits on track at various $\beta\gamma$ values (Fig. 6-upper),
and thicknesses (Fig. 6-lower). It is clear that there is substantial improvement for both silicon and neon, for all $\beta\gamma$
and thickness values. In case of few hits (e.g. $n = 3$) the relative resolution is reduced by 20-30% with respect to the simple
truncation. Note that the improvement tends to a limiting value and the ratio stays steadily below 1 for many
hits. For comparison the performance of the optimized weighted geometric mean is also plotted. It essentially shows
a behavior similar to that of the arithmetic mean, although with a better improvement at very low $n$ for neon.

Figure 6: Performance of the optimized estimator. The ratio of the relative resolutions (weighted over simple truncated mean) is given as a function
of number of hits on track $n$: for 300 μm silicon (upper left), and 1 cm neon (upper right), at various $\beta\gamma$ values; for silicon (lower left), and
neon (lower right), at $\beta\gamma = 3.17$ and various thicknesses. For comparison the performance of optimized weighted geometric mean is also shown
(triangles down and crosses). Lines are drawn to guide the eye.



6.1. Other considerations 

Detector and readout noise are more important for silicon since their magnitude is higher with respect to the energy
deposit. Apart from the standard values (2 keV for silicon, 0.01 keV for neon), the noise dependence of the optimal weights
was studied (Fig. 8). While higher values do not influence the weights for neon, in case of silicon at 10 keV their
distribution starts to deform into a box distribution. This behavior can be understood: the addition of the Gaussian
noise softens the lower leading edge of the energy deposit distribution (Fig. 1-left) and makes the lower values less
important.

In a more complete detector simulation the effects of readout threshold (underflow, left truncation) and the upper
limit of detector linearity (right censoring, overflow) should be taken into account. Still, the latter is likely not
important since the highest deposits will have low weights in any case.

6.2. Universality, connections 

Figure 7: Optimal weights scaled with the number of hits ($n \cdot w_i$) as a function of normalized hit number $(i - 1)/(n - 1)$, if $4 \le n \le 20$, at $\beta\gamma = 3.17$.
Values are shown for 300 μm silicon (left) and 1 cm neon (right). The shaded regions indicate a simple description of the weights: a combination
of linear and constant functions with changes at 0.65 for silicon, and 0.55 for neon.

While the list of weights as function of number of hits can be tabulated (see App. A), it would be much more convenient
to have a simpler description. Optimal weights scaled with the number of hits ($n \cdot w_i$) as a function of normalized hit
number $z = (i - 1)/(n - 1)$ are plotted in Fig. 7 if $4 \le n \le 20$. The dependence on the normalized hit number can be
easily described as a combination of linear and constant functions. For silicon, with $z_\mathrm{Si} \approx 0.65$,

$$ n \cdot w_i = \begin{cases} 2 (z_\mathrm{Si} - z) / z_\mathrm{Si}^2 & \text{if } z < z_\mathrm{Si} \\ 0 & \text{otherwise,} \end{cases} \tag{7} $$

while for neon, with $z_\mathrm{Ne} \approx 0.55$,

$$ n \cdot w_i = \begin{cases} 1 / z_\mathrm{Ne} & \text{if } z < z_\mathrm{Ne} \\ 0 & \text{otherwise.} \end{cases} \tag{8} $$
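
The linear-plus-constant description for silicon and the constant-plus-zero description for neon translate into a small lookup function. This is a sketch only: the function name is hypothetical, and the final renormalization to unit sum is an assumption added here, since for a finite number of hits the discretized weights do not sum exactly to one.

```python
def simple_weights(n, material):
    """Approximate optimal weights for n >= 2 ordered hits."""
    z_si, z_ne = 0.65, 0.55                  # break points quoted in the text
    scaled = []
    for i in range(1, n + 1):
        z = (i - 1) / (n - 1)                # normalized hit number
        if material == "silicon":            # linear fall-off, then zero
            value = 2.0 * (z_si - z) / z_si ** 2 if z < z_si else 0.0
        elif material == "neon":             # constant, then zero
            value = 1.0 / z_ne if z < z_ne else 0.0
        else:
            raise ValueError("material must be 'silicon' or 'neon'")
        scaled.append(value)
    total = sum(scaled)
    # Assumption: renormalize so that the weights sum to one for finite n.
    return [value / total for value in scaled]


if __name__ == "__main__":
    print(simple_weights(11, "silicon"))
    print(simple_weights(11, "neon"))
```

For neon this reduces to an equal-weight average of the lowest hits, that is, a plain truncated mean with a cut near 55%.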



In case of many measurements, the optimal weights can be obtained using the energy deposit distribution; App. B
gives a detailed derivation.

The simple functional forms described in Eqs. (7) and (8) have a deeper cause, namely they are strongly connected to
the functional form of the deposit distribution. For detailed argumentation see App. C. If the density (Fig. 1) can be
locally described by

• a power function, the local weights are zero;

• a product of exponential and power functions, the local weights are constant;

• a product of Gaussian and power functions, the local weights are linear in $z$.

This study was restricted to linear combinations of measurements. It can be shown that while for semiconductor
detectors the optimized weighted mean estimator may be further improved by using maximum likelihood methods,
for gaseous detectors the simple (0%,55%) truncation (Eq. (8)) already gives excellent results (see App. D).



7. Conclusions 

The estimation of differential energy loss for charged particles in tracker detectors was studied. It was shown that 
the simple truncated mean method can be generalized to the linear combination of the energy deposit measurements. 
The optimized weights are rather independent of particle momentum and material thickness, allowing for a robust
estimation. Weighted arithmetic and geometric means result in better particle separation power for both semiconductor 
and gaseous detectors. Further inspections showed that weights are deeply connected to corresponding energy deposit 
distribution, allowing for a simple universal description of weights as function of number of hits. 

Table 1: Optimal weights scaled with the number of hits ($n \cdot w_i$) for 300 μm silicon, at $\beta\gamma = 3.17$, in case of hit numbers $n = 2, \dots, 9$. Errors indicate
the sensitivity corresponding to 1% increase of the relative resolution of the estimator.

i    n = 2       n = 3       n = 4       n = 5       n = 6       n = 7       n = 8       n = 9
1    2.0 ± 3.6   3.0 ± 2.8   3.4 ± 0.7   3.3 ± 0.4   3.3 ± 0.3   3.4 ± 0.3   3.4 ± 0.2   3.5 ± 0.2
2   -0.0 ± 0.1   0.0 ± 0.1   0.7 ± 0.1   1.5 ± 0.2   1.9 ± 0.2   2.0 ± 0.3   2.2 ± 0.3   2.3 ± 0.3
3               -0.0 ± 0.1  -0.0 ± 0.1   0.2 ± 0.1   0.8 ± 0.1   1.3 ± 0.2   1.5 ± 0.2   1.7 ± 0.2
4                           -0.0 ± 0.1  -0.0 ± 0.1   0.0 ± 0.1   0.4 ± 0.1   0.8 ± 0.1   1.0 ± 0.1
5                                       -0.0 ± 0.1  -0.0 ± 0.1  -0.0 ± 0.1   0.1 ± 0.1   0.5 ± 0.1
6                                                   -0.0 ± 0.1  -0.0 ± 0.1  -0.1 ± 0.1   0.1 ± 0.1
7                                                               -0.0 ± 0.1  -0.0 ± 0.1  -0.1 ± 0.1
8                                                                           -0.0 ± 0.1  -0.0 ± 0.1
9                                                                                       -0.0 ± 0.1

Table 2: Optimal weights scaled with the number of hits ($n \cdot w_i$) for 1 cm neon, at $\beta\gamma = 3.17$, in case of hit numbers $n = 2, \dots, 9$. Errors indicate the
sensitivity corresponding to 1% increase of the relative resolution of the estimator.

i    n = 2       n = 3       n = 4       n = 5       n = 6       n = 7       n = 8       n = 9
1    2.0 ± 9.4   2.8 ± 0.8   2.3 ± 0.3   2.0 ± 0.2   1.9 ± 0.2   1.7 ± 0.2   1.6 ± 0.2   1.6 ± 0.2
2   -0.0 ± 0.1   0.2 ± 0.1   1.6 ± 0.2   2.0 ± 0.3   2.1 ± 0.3   2.1 ± 0.3   2.0 ± 0.3   2.0 ± 0.2
3               -0.0 ± 0.1   0.1 ± 0.1   1.0 ± 0.1   1.4 ± 0.2   1.9 ± 0.3   2.0 ± 0.3   1.9 ± 0.3
4                           -0.0 ± 0.1   0.0 ± 0.1   0.5 ± 0.1   1.0 ± 0.1   1.5 ± 0.2   1.8 ± 0.2
5                                       -0.0 ± 0.1   0.0 ± 0.1   0.3 ± 0.1   0.7 ± 0.1   1.1 ± 0.2
6                                                   -0.0 ± 0.1  -0.0 ± 0.1   0.2 ± 0.1   0.5 ± 0.1
7                                                               -0.0 ± 0.1  -0.0 ± 0.1   0.1 ± 0.1
8                                                                           -0.0 ± 0.1  -0.0 ± 0.1
9                                                                                       -0.0 ± 0.1

A. Optimal weights in case of few measurements 

The obtained weights for 300 μm silicon and 1 cm neon, at $\beta\gamma = 3.17$, in case of hit numbers $2 \le n \le 9$ are shown
in Tables 1 and 2, respectively. Errors indicate the sensitivity corresponding to 1% increase of the relative resolution
of the estimator.

B. Optimal weights in case of many measurements



A random variable $x$ is described by the probability density function $f(x)$ and cumulative distribution function
$F(x) = \int_{-\infty}^{x} f(x')\, dx'$. During an observation $x$ is sampled $n$ times, and these measurements are rearranged in increasing
order ($x_i \le x_{i+1}$, $i = 1, 2, \dots, n$). We are interested in the means $m_i$ and covariance $V_{ij}$ of the ordered samples in the
continuous limit ($n \gg 1$).

B.1. Means and covariance of measurements

The probability density that $x_i$ is the $i$th measurement is

$$ p(x_i) = \frac{n!}{(i-1)!\,(n-i)!}\, F(x_i)^{i-1} f(x_i) [1 - F(x_i)]^{n-i}. $$

It is easier to work with $-\log p$, since its minimum gives the most probable value $\hat{x}_i$. In the Gaussian approximation
the mean $m_i \approx \hat{x}_i$ and its variance $V_{ii}$ are also calculable. If $1 < i < n$, then the factors containing $F$ already constrain
the position of $m_i$ well enough, hence $f(x_i)$ can be approximated by a constant:

$$ p(x_i) \propto \frac{n!}{(i-1)!\,(n-i)!}\, F(x_i)^{i-1} [1 - F(x_i)]^{n-i}. \tag{9} $$

At $x_i = m_i$

$$ [-\log p(x)]' = 0, \qquad 1/V_{ii} \approx [-\log p(x)]''. $$

By solving the equation on the left, we get

$$ F(m_i) = \frac{i-1}{n-1}, \qquad V_{ii} = \frac{(i-1)(n-i)}{(n-1)^3 f^2(m_i)}. \tag{10} $$

Note again that the above approximations are valid only for $1 < i < n$. (For the lowest and highest measurements we
would get $m_1 = -\infty$, $m_n = \infty$, and $\sigma_1 = \sigma_n = \infty$.)

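The probability density that $x_i$ and $x_j$ are the $i$th and $j$th measurements ($i < j$), respectively, is

$$ p(x_i, x_j) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\, F(x_i)^{i-1} [F(x_j) - F(x_i)]^{j-i-1} [1 - F(x_j)]^{n-j}. $$

Similarly to Eq. (9), by assuming $1 < i < j < n$, we have omitted the factors $f(x_i)$ and $f(x_j)$.

Close to the minimum of $-\log p(x_i, x_j)$ the density can be approximated by a multivariate normal distribution. At the minimum
($x_i = m_i$ and $x_j = m_j$) the correlation coefficient is

$$ \rho_{ij} = \frac{\partial^2 \log p}{\partial x_i \partial x_j} \left[ \frac{\partial^2 \log p}{\partial x_i^2}\, \frac{\partial^2 \log p}{\partial x_j^2} \right]^{-1/2}. $$

The partial derivatives are

$$ \frac{\partial^2 \log p}{\partial x_i \partial x_j} = (n-1)^2\, \frac{j-i-1}{(j-i)^2}\, f(m_i) f(m_j), $$

$$ \frac{\partial^2 \log p}{\partial x_i^2} = \frac{n-1}{j-i}\, f'(m_i) - (n-1)^2\, \frac{(j-i)(j-1) - (i-1)}{(i-1)(j-i)^2}\, f^2(m_i), $$

$$ \frac{\partial^2 \log p}{\partial x_j^2} = -\frac{n-1}{j-i}\, f'(m_j) - (n-1)^2\, \frac{(j-i)(n-i) - (n-j)}{(n-j)(j-i)^2}\, f^2(m_j). $$

In the $n \gg 1$ limit the coefficients of $f'$ can be neglected if compared to the ones of the $f^2$ terms. Similarly the
numerators of the coefficients of $f^2$ in all three cases can be simplified by neglecting the terms 1, $(i - 1)$ and $(n - j)$,
respectively:

$$ \frac{\partial^2 \log p}{\partial x_i \partial x_j} \approx (n-1)^2\, \frac{1}{j-i}\, f(m_i) f(m_j), $$

$$ \frac{\partial^2 \log p}{\partial x_i^2} \approx -(n-1)^2\, \frac{j-1}{(i-1)(j-i)}\, f^2(m_i), $$

$$ \frac{\partial^2 \log p}{\partial x_j^2} \approx -(n-1)^2\, \frac{n-i}{(n-j)(j-i)}\, f^2(m_j). $$

With that the correlation coefficient is

$$ \rho_{ij} = \sqrt{\frac{(i-1)(n-j)}{(n-i)(j-1)}}. $$

Note that $\rho_{ij}$ does not depend on $f$. The covariance of the $i$th and $j$th measurements ($i < j$) is

$$ V_{ij} = \rho_{ij} \sqrt{V_{ii} V_{jj}} = \frac{(i-1)(n-j)}{(n-1)^3 f(m_i) f(m_j)}. \tag{11} $$

The expression gives nicely back the variance ($i = j$) already obtained in Eq. (10)-right.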



B.2. Optimal weights 

With the vector of means $m_i$ (Eq. (10)-left) and the covariance matrix $V_{ij}$ (Eq. (11)), the optimal weights are
calculable. In the high $n$ limit, we can consider the following continuous variables

$$ a = \frac{i-1}{n-1}, \qquad b = \frac{j-1}{n-1}, \qquad 0 < a, b < 1. $$

The means and covariance in these variables are

$$ m(a) = F^{-1}(a), \qquad V(a, b) = \min(a, b) [1 - \max(a, b)]\, m'(a) m'(b). $$

On the analogy of $\mathbf{m} = V\mathbf{w}$ there is a Fredholm integral equation of the first kind for $w$,

$$ m(a) = \int_0^1 V(a, b) w(b)\, db = \int_0^1 \min(a, b) [1 - \max(a, b)]\, m'(a) m'(b) w(b)\, db. $$

It can be written in the form

$$ g(a) = \int_0^1 K(a, b) y(b)\, db, $$

where

$$ g(a) = \frac{m(a)}{m'(a)}, \qquad y(b) = m'(b) w(b), \qquad K(a, b) = \min(a, b) [1 - \max(a, b)]. $$

Figure 8: Optimal weights scaled with the number of hits ($n \cdot w_i$) as a function of normalized hit number $(i - 1)/(n - 1)$, if $n$ is big, at $\beta\gamma = 3.17$.
Values are shown for 300 μm silicon (left) and 1 cm neon (right), both with several noise settings.



The kernel $K$ is a special one:

$$ g(a) = \int_0^a (1 - a)\, b\, y(b)\, db + \int_a^1 a (1 - b)\, y(b)\, db, $$

$$ g'(a) = -\int_0^a b\, y(b)\, db + \int_a^1 (1 - b)\, y(b)\, db. $$

The equation can be solved by two consecutive differentiations with respect to $a$, giving the result

$$ g''(a) = -y(a). $$

In summary the optimal weights $w$ in the continuous variable $a$ are

$$ w = -\frac{1}{m'} \left( \frac{m}{m'} \right)''. \tag{12} $$

With the energy loss $m$ as variable, for a weight at $F(m)$,

$$ w[F(m)] = -\frac{f'}{f} + m \left( \frac{f'}{f} \right)^2 - m \frac{f''}{f} = -[m (\log f)']', \tag{13} $$

where the primes on $f$ denote derivatives with respect to $m$.

With the help of this exact result (Eq. (13)), the optimal weights scaled with the number of hits ($n \cdot w_i$) as a function
of normalized hit number $z = (i - 1)/(n - 1)$ are shown in Fig. 8 at $\beta\gamma = 3.17$, if $n$ is big.
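
The continuous-limit weight $-[m (\log f)']'$ can be checked numerically for simple density shapes. The sketch below uses nested central differences; the function name, the step size and the three test densities (with made-up parameters) are assumptions chosen for illustration.

```python
import math


def continuous_weight(log_f, m, h=1e-3):
    """Evaluate w = -d/dm [ m * d(log f)/dm ] by central differences."""
    def m_times_slope(x):
        return x * (log_f(x + h) - log_f(x - h)) / (2.0 * h)
    return -(m_times_slope(m + h) - m_times_slope(m - h)) / (2.0 * h)


def log_power(x):
    """Pure power law, f ~ x^-2."""
    return -2.0 * math.log(x)


def log_power_exp(x):
    """Power times exponential, f ~ x^2 exp(-3 x)."""
    return 2.0 * math.log(x) - 3.0 * x


def log_power_gauss(x):
    """Power times Gaussian with mean 1 and variance 0.25."""
    return 2.0 * math.log(x) - (x - 1.0) ** 2 / (2.0 * 0.25)


if __name__ == "__main__":
    for m in (1.0, 1.5, 2.0):
        print(m, continuous_weight(log_power, m),
              continuous_weight(log_power_exp, m),
              continuous_weight(log_power_gauss, m))
```

The power law gives vanishing weights, the power-exponential product gives a constant weight equal to the exponential slope, and the power-Gaussian product gives a weight that grows linearly with $m$.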
The minimum of the relative resolution of the weighted average is 

C. Irrelevant and equally relevant measurements

In the continuous limit we can find probability density functions $f$ with special characteristics, by solving
Eq. (12).

The ordered measurements are irrelevant if $w = 0$. The condition $1/m' = 0$ would give $f = 0$, a meaningless
unphysical solution. On the other hand, if

$$ \left( \frac{m}{m'} \right)'' = 0, $$

then $m/m'$ is linear in $z$, $m/m' = (b + 1) z + d$, and

$$ m(z) \propto \left[ z + \frac{d}{b+1} \right]^{1/(b+1)}, \qquad f(m) = \frac{1}{m'(z)} = a \cdot m^b. \tag{14} $$

Hence measurements from distributions with power functions should be neglected; their optimal weights are zero.
The ordered measurements are equally relevant if

$$ w = -\frac{1}{m'} \left( \frac{m}{m'} \right)'' = c = \mathrm{const} > 0. $$

The equation

$$ -f[m(z)] \left( \frac{m}{m'} \right)'' = c $$

can be solved by resolving the composition $f \circ m$, giving

$$ f(m) = a \cdot m^b \exp(-c \cdot m), $$

which is the product of a power and an exponential function (compare with Fig. 1-right and Fig. 7-right, the case of
neon). Note that $c = 0$ indeed gives back the result in Eq. (14). It can be shown that the weights are a linear function of $z$
if $f(m)$ is the product of a power and a Gaussian function (compare with Fig. 1-left and Fig. 7-left, the case of silicon).

The distribution of energy loss $f(\Delta)$ examined in this study has a $1/\Delta^2$ behavior for large energy deposits due
to Coulomb excitations. (The Landau distribution, which is often used to model and approximate energy loss, also
has a $1/\Delta^2$ power-law tail.) This is why we got small optimal weights for the upper measurements. It is also the
fundamental cause for the success of the classical truncated mean method.



D. Connection to maximum likelihood estimation 

In this work the optimization of differential energy loss estimation was confined to the linear combination of 
the measurements, or of a monotonic function of the measurements. Are there cases when this relatively simple 
prescription is close to the performance of a, supposedly more powerful, maximum likelihood estimation? 

Let us assume that the scale of the energy loss distribution is characterized by the most probable deposit $y_p$, such that
the probability density of a deposit $y$ is given by

$$ \frac{y_0}{y_p}\, f\left( y\, \frac{y_0}{y_p} \right), $$

where $f$ is a universal function and $y_0$ is constant. In that case the likelihood associated to a track with $n$ hits is

$$ P = \prod_{i=1}^n \frac{y_0}{y_p}\, f\left( y_i\, \frac{y_0}{y_p} \right) $$

and

$$ -\log P = n \log \frac{y_p}{y_0} - \sum_{i=1}^n \log f\left( y_i\, \frac{y_0}{y_p} \right). $$

It has an extremum where the derivative with respect to $y_p$ is zero, giving

$$ y_p = \frac{y_0}{n} \sum_{i=1}^n y_i\, (-\log f)'\left( y_i\, \frac{y_0}{y_p} \right). $$

The most probable value $y_p$ can be obtained by a simple linear combination of measurements if, for an interval around
$y_i$, the function $(-\log f)'$ behaves as

$$ (-\log f)'(y) \approx c_i - b_i / y, $$

where $c_i$ and $b_i$ are local constants. With this

$$ y_p = y_0\, \frac{\sum_{i=1}^n c_i y_i}{n + \sum_{i=1}^n b_i}. $$

The corresponding functional form is

$$ f(y) \approx a \cdot y^b \exp(-c \cdot y), $$

which exactly matches the form found for neon (see Fig. 1-right and App. C).

In summary we can say that while for semiconductor detectors the optimized weighted mean estimator may be
further improved with maximum likelihood methods, for gaseous detectors the simple (0%,55%) truncation already
gives excellent results.